Top-Down Induction of Decision Trees: Rigorous Guarantees and Inherent Limitations
Consider the following heuristic for building a decision tree for a function $f : \{0,1\}^n \to \{\pm 1\}$. Place the most influential variable $x_i$ of $f$ at the root, and recurse on the subfunctions $f_{x_i = 0}$ and $f_{x_i = 1}$ on the left and right subtrees respectively; terminate once the tree is an $\varepsilon$-approximation of $f$. We analyze the quality of this heuristic, obtaining near-matching upper and lower bounds:
Upper bound: For every $f$ with decision tree size $s$ and every $\varepsilon \in (0, 1/2)$, this heuristic builds a decision tree of size
at most $s^{O(\log(s/\varepsilon)\log(1/\varepsilon))}$.
Lower bound: For every $\varepsilon \in (0, 1/2)$ and $s \le 2^{\tilde{O}(\sqrt{n})}$, there is an $f$ with decision tree size $s$ such that
this heuristic builds a decision tree of size $s^{\tilde{\Omega}(\log s)}$.
We also obtain upper and lower bounds for monotone functions: $s^{O(\sqrt{\log s}/\varepsilon)}$ and
$s^{\tilde{\Omega}(\sqrt[4]{\log s})}$ respectively. The lower bound disproves conjectures of Fiat and Pechyony (2004)
and Lee (2009).
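The greedy heuristic described above is concrete enough to sketch in code. The following is an illustrative implementation, not the paper's: it assumes access to the full truth table of f (so it is only feasible for small n), uses {0, 1} outputs rather than ±1, and replaces the global eps-approximation stopping rule with the common local variant of stopping once a subfunction is eps-close to constant. All names are my own.

```python
from itertools import product

def build_tree(f, n, rho, eps):
    """Greedy top-down heuristic: split on the most influential free
    variable; a branch becomes a leaf once its subfunction is eps-close
    to constant. rho maps already-fixed coordinates to bits; f maps an
    n-bit tuple to {0, 1}."""
    free = [i for i in range(n) if i not in rho]

    def full(x_free):
        # Evaluate f on the completion of rho by the free-variable bits.
        x = [0] * n
        for i, b in rho.items():
            x[i] = b
        for i, b in zip(free, x_free):
            x[i] = b
        return f(tuple(x))

    pts = list(product([0, 1], repeat=len(free)))
    frac_ones = sum(full(x) for x in pts) / len(pts)
    if not free or min(frac_ones, 1 - frac_ones) <= eps:
        return round(frac_ones)  # constant leaf

    def influence(j):
        # Pr over uniform x that flipping coordinate j changes f(x).
        k = free.index(j)
        return sum(full(x) != full(x[:k] + (1 - x[k],) + x[k + 1:])
                   for x in pts) / len(pts)

    best = max(free, key=influence)  # most influential free variable
    return (best,
            build_tree(f, n, {**rho, best: 0}, eps),
            build_tree(f, n, {**rho, best: 1}, eps))

def eval_tree(t, x):
    # Internal nodes are (variable, left, right); leaves are constants.
    while isinstance(t, tuple):
        v, lo, hi = t
        t = hi if x[v] else lo
    return t

# Example: majority of 3 bits; the heuristic recovers it exactly.
maj3 = lambda x: int(x[0] + x[1] + x[2] >= 2)
tree = build_tree(maj3, 3, {}, eps=0.1)
```

With eps = 0.1 every leaf here is an exact constant, so the hypothesis agrees with maj3 on all eight inputs.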
Our upper bounds yield new algorithms for properly learning decision trees
under the uniform distribution. We show that these algorithms---which are
motivated by widely employed and empirically successful top-down decision tree
learning heuristics such as ID3, C4.5, and CART---achieve provable guarantees
that compare favorably with those of the current fastest algorithm (Ehrenfeucht
and Haussler, 1989). Our lower bounds shed new light on the limitations of
these heuristics.
Finally, we revisit the classic work of Ehrenfeucht and Haussler. We extend
it to give the first uniform-distribution proper learning algorithm that
achieves polynomial sample and memory complexity, while matching its
state-of-the-art quasipolynomial runtime.
Learning about pain through observation: the role of pain-related fear
Observational learning may contribute to the development and maintenance of pain-related beliefs and behaviors. The current study examined whether observation of video primes could impact appraisals of potential back-stressing activities, and whether this relationship was moderated by individual differences in pain-related fear. Participants viewed a video prime in which back-stressing activity was associated with pain and injury. Both before and after viewing the prime, participants provided pain and harm ratings of standardized movements drawn from the Photograph of Daily Activities Scale (PHODA). Results indicated that observational learning occurred for participants with high levels of pain-related fear but not for low-fear participants. Specifically, following prime exposure, high-fear participants showed elevated pain appraisals of activity images, whereas low-fear participants did not. High-fear participants appraised the PHODA-M images as significantly more harmful regardless of prime exposure. The findings highlight individual moderators of observational learning in the context of pain.
Agnostic proper learning of monotone functions: beyond the black-box correction barrier
We give the first agnostic, efficient, proper learning algorithm for monotone
Boolean functions. Given $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$ uniformly random
examples of an unknown function $f : \{\pm 1\}^n \to \{\pm 1\}$, our
algorithm outputs a hypothesis $g : \{\pm 1\}^n \to \{\pm 1\}$ that is
monotone and $(\mathrm{opt} + \varepsilon)$-close to $f$, where $\mathrm{opt}$
is the distance from $f$ to the closest monotone function. The running time of
the algorithm (and consequently the size and evaluation time of the hypothesis)
is also $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$, nearly matching the lower bound
of Blais et al. (RANDOM '15). We also give an algorithm for estimating up to
additive error $\varepsilon$ the distance of an unknown function $f$ to
monotone using a run-time of $2^{\tilde{O}(\sqrt{n}/\varepsilon)}$. Previously,
for both of these problems, sample-efficient algorithms were known, but these
algorithms were not run-time efficient. Our work thus closes this gap
between the run-time and sample complexity of these two problems.
This work builds upon the improper learning algorithm of Bshouty and Tamon
(JACM '96) and the proper semiagnostic learning algorithm of Lange, Rubinfeld,
and Vasilyan (FOCS '22), which obtains a non-monotone Boolean-valued
hypothesis, then ``corrects'' it to monotone using query-efficient local
computation algorithms on graphs. This black-box correction approach can
achieve no error better than $2\,\mathrm{opt} + \varepsilon$
information-theoretically; we bypass this barrier by
a) augmenting the improper learner with a convex optimization step, and
b) learning and correcting a real-valued function before rounding its values
to Boolean.
Our real-valued correction algorithm solves the ``poset sorting'' problem of
[LRV22] for functions over general posets with non-Boolean labels.
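For intuition about the benchmark quantity opt (the distance from f to the nearest monotone function), it can be computed exactly by brute force for tiny n. A throwaway sketch, with function names of my own; it has nothing to do with the paper's efficient estimator:

```python
from itertools import product

def dist_to_monotone(f, n):
    """Exact distance (fraction of disagreeing points) from f to the
    nearest monotone function, by brute force over all 2^(2^n) Boolean
    functions -- only feasible for tiny n. f maps an n-bit tuple to {0, 1}."""
    pts = list(product([0, 1], repeat=n))
    below = lambda x, y: all(a <= b for a, b in zip(x, y))
    fvals = [f(x) for x in pts]
    best = 1.0
    for bits in product([0, 1], repeat=len(pts)):
        g = dict(zip(pts, bits))
        # Keep g only if it is monotone: x <= y pointwise implies g(x) <= g(y).
        if all(g[x] <= g[y] for x in pts for y in pts if below(x, y)):
            d = sum(fv != g[x] for x, fv in zip(pts, fvals)) / len(pts)
            best = min(best, d)
    return best

# OR is monotone, so its distance is 0; NOR's single 1 (at 000) must be
# removed, giving distance 1/8 over the 8 points of the 3-cube.
assert dist_to_monotone(lambda x: max(x), 3) == 0.0
assert dist_to_monotone(lambda x: 1 - max(x), 3) == 0.125
```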
Decision Tree Heuristics Can Fail, Even in the Smoothed Setting
Greedy decision tree learning heuristics are mainstays of machine learning practice, but theoretical justification for their empirical success remains elusive. In fact, it has long been known that there are simple target functions for which they fail badly (Kearns and Mansour, STOC 1996).
Recent work of Brutzkus, Daniely, and Malach (COLT 2020) considered the smoothed analysis model as a possible avenue towards resolving this disconnect. Within the smoothed setting and for targets $f$ that are $k$-juntas, they showed that these heuristics successfully learn $f$ with depth-$k$ decision tree hypotheses. They conjectured that the same guarantee holds more generally for targets that are depth-$k$ decision trees.
We provide a counterexample to this conjecture: we construct targets that are depth-$k$ decision trees and show that even in the smoothed setting, these heuristics build trees of depth $2^{\Omega(k)}$ before achieving high accuracy. We also show that the guarantees of Brutzkus et al. cannot extend to the agnostic setting: there are targets that are very close to $k$-juntas, for which these heuristics build trees of depth $2^{\Omega(k)}$ before achieving high accuracy.
A Query-Optimal Algorithm for Finding Counterfactuals
We design an algorithm for finding counterfactuals with strong theoretical
guarantees on its performance. For any monotone model $f : X^d \to \{0,1\}$ and
instance $x^\star$, our algorithm makes $S(f)^{O(\Delta_f(x^\star))} \cdot \log d$ queries to $f$ and returns an optimal counterfactual for
$x^\star$: a nearest instance $x'$ to $x^\star$ for which $f(x') \ne f(x^\star)$. Here $S(f)$ is the sensitivity of $f$, a discrete analogue of the
Lipschitz constant, and $\Delta_f(x^\star)$ is the distance from $x^\star$ to
its nearest counterfactuals. The previous best known query complexity was
$d^{O(\Delta_f(x^\star))}$, achievable by brute-force local search. We
further prove a lower bound of $S(f)^{\Omega(\Delta_f(x^\star))} + \Omega(\log d)$ on the query complexity of any algorithm, thereby showing that the
guarantees of our algorithm are essentially optimal.
Comment: 22 pages, ICML 202
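The brute-force local-search baseline mentioned in this abstract is simple to state. Below is an illustrative sketch for binary feature vectors (the function names are mine, and the model is a stand-in for an arbitrary black box):

```python
from itertools import combinations

def nearest_counterfactual(f, x):
    """Brute-force local search: try every subset of coordinates to flip,
    in increasing order of size, and return the first flipped instance on
    which the model's output changes -- hence a nearest counterfactual in
    Hamming distance. Query cost grows as d^O(delta), where delta is the
    distance to the nearest counterfactual."""
    d = len(x)
    y0 = f(x)
    for r in range(1, d + 1):
        for idxs in combinations(range(d), r):
            z = list(x)
            for i in idxs:
                z[i] = 1 - z[i]  # flip the chosen coordinates
            z = tuple(z)
            if f(z) != y0:
                return z
    return None  # f is constant: no counterfactual exists

# Example: a monotone model, "at least two of the first three features".
model = lambda x: int(x[0] + x[1] + x[2] >= 2)
cf = nearest_counterfactual(model, (1, 1, 0, 0))
```

Here a single flip already changes the prediction, so the search stops at Hamming distance 1.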